ABSTRACT: Numerous studies have proposed virtual training for construction safety as a promising
approach. However, creating realistic training scenarios requires significant resources, encompassing various
elements such as sound, graphics, agent behavior, and realistic hazards. Digital Twins promise to reduce this
workload significantly, but so far only on a conceptual level, and their full potential remains unexploited.
In this work, we propose a novel approach that leverages Real-time Location
Systems (RTLS) data to simulate the real-world behavior of construction workers and equipment within Virtual 
Training Environments (VTEs). We aim to create training scenarios with dynamic, real-world hazardous events instead of
hard-coded, made-up ones. To achieve this, we propose an extension to our Digital Twin for Construction Safety
(DTCS) framework that now integrates (a) trajectory data streams of construction personnel and equipment and 
(b) technical specifications of the construction site work environment, including location and geometry of terrain 
and surface objects, to simulate real-world hazards in virtual safety training scenarios. Our further contribution 
is a case study application to explore the DTCS training capacity. Applying a logical filtering algorithm, we can 
process the RTLS data and ensure that the movements of the workers and equipment within the virtual environment 
are as realistic and representative as within the real world. This then enables the creation of realistic hazards that 
trainees can encounter in the training phase. Preliminary results with trainees suggest that the proposed work
has high potential to enhance the realism of safety training, especially when trainees need to experience
human-machine interactions safely. However, further work is required to create more responsive learning
environments where the equipment follows real trajectories but also responds intelligently to the trainees' actions. 
By leveraging real-time data and advanced visualization technologies, we bridge the gap between the physical 
and virtual realms, enabling trainees to interact and navigate within a realistic virtual environment. 


1. INTRODUCTION


The number of occupational injuries and fatalities in the construction industry remains high despite significant
investments in safety measures. Construction sites are among the deadliest workplaces in the United
States (BLS, 2022). The industry has introduced several approaches to increase safety. Generally, they can be
separated into three categories: (1) Prevention through design and planning, (2) Right-time intervention, and (3) 
Prevention through training and education. Training in simulated environments has become more popular over the
last few years. One advantage of virtual training is that the trainee can practice tasks in a safe
environment where mistakes cannot lead to injuries. Several studies investigated virtual training
for construction safety using Virtual Reality (VR) with head-mounted displays (Fang et al., 2014; Hilfert et al., 
2016; Wolf et al., 2019; Jacobsen et al., 2022; Sacks et al., 2013; Jelonek et al., 2022), or desktop-based virtual 
training (Speiser & Teizer, 2023a). More recently, Bükrü et al. (2019) and Wolf et al. (2022) developed the concept 
of Augmented Virtuality (AV) in construction safety training. Noteworthy in their studies is the use of real
hand-powered tools to generate haptic control and feedback in a virtual learning environment designed for
construction trainees rather than for academic student participants. These studies highlight that creating such
training environments requires significant effort, regardless of the technology used. To ease the creation
of the training environment, Golovina et al. (2019a) use Building Information Modelling (BIM), and our previous 
research introduced a data model for a digital twin, indicating a significant decrease in resources for generating 
the training scenes (Speiser & Teizer, 2023b). Still, most studies developed hard-coded scenarios where hazards 
are artificial. Hence, there is potential for more realism with less effort by including additional data sources and 
realistic hazards. 


Despite VR being adopted in various industries, the terminology remains ill-defined. VR is often associated with 
an immersive experience using head-mounted displays. Some studies consider desktop-based experiences as VR 
(Wang et al., 2018), while others speak of fully immersed systems that utilize a head-mounted display (Kim et al., 
2017). Milgram et al. (1994) first introduced a continuum describing the different mixtures of reality and virtuality.
In this continuum, VR defines an environment where only virtual elements exist. In contrast, Augmented
Reality (AR) describes systems where most elements are real and a few are virtual. This continuum was intended to
classify Mixed Reality (MR) experiences with visual displays. However, since 1994, several studies have
developed other kinds of displays to include haptics (Azmandian et al., 2016), multidimensional sound (Savioja 
& Svensson, 2015), and scent (Yanagida, 2012). Integrating such displays creates a new level of immersion. 


Skarbez et al. (2021) revisited the continuum from Milgram et al. (1994) and
proposed new indicators that account for recent technological developments: (1) the Extent of World
Knowledge (EWK), (2) IMmersion (IM), and (3) COherence (CO). The system's immersion defines the level of 
immersive feedback to actions. Skarbez et al. (2021) conclude that a fully immersive system must realistically 
respond to all human senses. EWK describes what objects are part of the virtual experience and how they are 
represented. Nowadays, the Internet of Things (IoT) can provide advanced information and replicate real elements 
in virtuality more accurately. CO indicates how consistently the system reacts to the users' intentions. Game 
engines provide functionalities such as realistic lighting or gravity to make virtual environments coherent. 


As mentioned before, creating virtual training in construction safety is time-consuming and requires realistic
hazards. Much of the work goes into developing training scenarios in which machines perform plausible
tasks and move accordingly to represent realistic hazards. IoT can provide such real-world data, and game engines
provide real-world physics to achieve coherent experiences. While previous studies have used game engines for 
creating virtual experiences, no previous work integrated real-world hazards from IoT devices. This study proposes 
a novel method for bridging this gap by streaming data from IoT devices into a Virtual Training Environment 
(VTE). The objective is to create more realistic training scenarios with hazards from real-world data. The method 
increases the EWK as well as the CO of these systems. The remainder of this paper first describes the relevant
research gap and introduces the framework integrating IoT devices. Second, a case study validates the proposed
method using two training scenarios. Finally, we summarize the results and conclude with future work.


2. RELATED WORK AND IDENTIFIED RESEARCH GAP 


While virtual training for construction safety has shown promise in improving workers' safety awareness (Adami 
et al., 2023), a significant research gap exists. The current state of virtual training for construction workers lacks 
the integration of real-world data from IoT devices, which hinders higher levels of realism in the training 
experiences. A literature review revealed that studies have explored the use of game engines for simulating real- 
world physics (Juang et al., 2011), BIM (Golovina & Teizer, 2022), and digital twins (Speiser & Teizer, 2023a; 
Teizer et al., 2024) to create virtual training scenarios for construction safety. These approaches have enabled the 
development of more interactive and immersive training environments, allowing trainees to practice tasks safely. 
However, to date, no previous work has integrated real-world data obtained through IoT into virtual safety training 
to expose the trainees to realistic hazards despite the potential benefits (Salinas et al., 2022; Zoleykani et al., 2023). 


Several studies on Real-time Location Systems (RTLS) exist in construction, as RTLS can monitor the precise
location of objects. Park et al. (2017) detected hazard exposure in workers using Bluetooth Low Energy (BLE), a 
technology enabling low-power communication between devices. Chae & Yoshida (2010) introduced an approach 
to prevent collisions with heavy construction machinery using radio-frequency identification (RFID). Teizer et al. 
(2008) used Ultra Wideband (UWB) for tracking construction resources and later for visualizing worker and gantry
crane trajectory data in a first-of-its-kind real-time VR learning environment for ironworker trainees (Teizer et al.,
2013). Narumi et al. (2018) stressed the applicability of Real-Time Kinematic Global Navigation Satellite Systems
(RTK-GNSS) for teleoperating construction equipment. 


This research bridges the research gap and unlocks crucial advantages by incorporating real-world data from RTLS 
devices into the VTEs. First, the IoT data will significantly increase the realism of the training simulations as the 
trainees virtually experience accurate information about the construction site conditions, worker locations, and 
equipment status. Second, the IoT data will enable the realistic reproduction of hazardous events for the trainee 
and, therefore, make the performance assessment more meaningful. Third, processing the data with appropriate 
algorithms will enhance the coherence. Construction sites are dynamic environments with numerous interacting 
elements. Incorporating data from IoT devices will allow the VTE to respond dynamically to changes in real-world 
conditions, thereby creating more coherent and contextually relevant training experiences. 


In summary, the identified research gap is the lack of integrating real-world data from IoT devices into VTEs for
construction safety. The proposed research aims to address this gap by developing a novel method that leverages 
RTLS data to enhance the realism, coherence, and practicality of virtual training scenarios, ultimately contributing 
to improved safety measures and reduced occupational injuries and fatalities in the construction industry. 


3. DIGITAL TWIN FRAMEWORK AND REAL-TIME DATA PROCESSING 


Figure 1 illustrates the proposed framework enabling real-time training for construction safety that is based on the 
DTCS proposed by Teizer et al. (2024) but focuses on components related to virtual training. 


Fig. 1: Digital Twin framework integrating IoT data into virtual training environments.


3.1 Input data


The core of the framework is the VTE that utilizes the Project Status Knowledge (PSK) from the DTCS to virtually 
represent the construction site. The PSK contains the BIM model, a landscape model, the required resources, and 
a construction schedule. The PSK also encompasses a hazard library defining currently existing hazards such as 
restricted working areas, fall hazards, or moving machinery. These hazard zones geometrically describe areas 
where workers are in danger. Our previous work proposed a data model integrating hazards and safety regulations 
in a VTE (Speiser & Teizer, 2023b). The hazards can either be automatically detected by algorithms evaluating the
PSK using safety regulations or can be modeled manually. This framework assumes that the VTE receives 
geometrically representable hazard zones. The landscape model contributes to realism by providing real 
surroundings. The requirements for the landscape model only concern the geometry for rendering purposes. For 
instance, a mesh from photogrammetry or laser scans may suffice. 
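
As a minimal sketch of what a "geometrically representable hazard zone" could look like once it reaches the VTE, the following Python snippet models a zone as a 2D polygon footprint and tests whether a position lies inside it. The class name `HazardZone`, its fields, and the ray-casting test are illustrative assumptions and not part of the DTCS data model.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in metres, local site coordinates (assumed)


@dataclass
class HazardZone:
    """Hypothetical hazard-zone record: a label plus a polygon footprint."""
    name: str
    polygon: List[Point]  # vertices in order, first vertex not repeated

    def contains(self, p: Point) -> bool:
        """Ray-casting point-in-polygon test (points exactly on an edge are not handled specially)."""
        x, y = p
        inside = False
        n = len(self.polygon)
        for i in range(n):
            x1, y1 = self.polygon[i]
            x2, y2 = self.polygon[(i + 1) % n]
            # Check whether a horizontal ray starting at p crosses this edge.
            if (y1 > y) != (y2 > y):
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x < x_cross:
                    inside = not inside
        return inside


# Made-up rectangular zone of 4 m x 3 m for illustration only.
restricted = HazardZone("restricted working area", [(0, 0), (4, 0), (4, 3), (0, 3)])
print(restricted.contains((2.0, 1.5)))  # position inside the rectangle
print(restricted.contains((5.0, 1.5)))  # position outside the rectangle
```

A 2D footprint is the simplest possible choice; fall hazards or zones that depend on height would need a 3D description.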


3.2 RTLS data processing 


The core novelty of this study is the integration of real-world resources into the VTE to enhance realism
through EWK and CO. Such resources include human workers, machinery, or materials. This work focuses on
human workers and heavy machinery, such as wheel loaders or excavators. The representation of these resources
in the virtual world utilizes geometrical descriptions as well as RTLS sensors to localize the resources in real time.
We expect the framework to function with all types of RTLS, provided the data quality is sufficiently high.


RTLS data provides spatial-temporal information, which allows us to localize a resource at a timestamp. This 
information increases the EWK of the virtual environment as the virtual objects are placed at the real location. 
However, RTLS does not provide further knowledge about the state of the resource (e.g., orientation of the 
resource). Such knowledge is essential for MR experiences to generate coherent experiences. The real-time 
processing module generates knowledge about the state of a resource and simulates the motions realistically using 
technical specifications of individual resources, physics, and logic. 


The technical specifications help to create a realistic object of the resource and load it into the MR scene. Besides
the correct geometry, they also include the possible motions of the resource. For instance, how does a machine steer,
or can the machine move backward? The real-time processing module utilizes this information together with
physics and logic statements to simulate the motions of the resources realistically. Physics integrates physical laws
such as gravity or sound emissions, and the logic statements exclude irrational simulations. For instance, a railway
wagon cannot rotate by 180° but only reverses its direction of travel, while a forklift may rotate by 180° on the spot.
The real-time processing module prioritizes realistic visualizations over the accuracy of the sensor data. We would
rather displace a resource from its recorded position within a tolerance than simulate abrupt motions.


3.3 Output: Performance assessment and personalized feedback 


The performance analysis module collects data on how the trainee interacts with hazards throughout the training 
experience. Utilizing the hazard library based on safety regulations, this module constantly checks for violations 
of safety rules from the trainee. Once such violations are detected, further algorithms can evaluate the severity of 
the violation. For instance, Golovina et al. (2019b) introduced an approach to classify safety violations. To 
implement such a method in this framework, the digital twin must provide the location and geometry of hazard 
zones. Previous research has proposed virtual training for collecting such data (Golovina & Teizer, 2022). 


Once the training ends, the performance data is processed, and personalized feedback is generated. Personalized 
feedback to trainees has various benefits. Among other things, learning is more efficient when trainees
understand what they did and how they can improve (Pianta et al., 2012). The feedback must summarize and assess
the trainee's performance and graphically describe potential improvements. The feedback is also shared with the 
trainer, who can compare different performances. 


4. CASE STUDY 


To validate the proposed framework, we conducted an experiment at an infrastructure project in Munich, Germany.
The study comprises five steps: (1) collecting RTLS data from construction resources, (2) generating the game
scene in Unity, (3) processing the RTLS data, (4) simulating the resources in the Unity scene using the processed
data, and (5) evaluating the simulation. The following sections describe how we tested the framework and
finish with the changes required to provide (near) real-time training.


4.1 Reality: RTLS data collection 


We collected the RTLS data at the staging area for a subway track replacement project in Munich. The collection 
lasted for three days during the early stage of the project. During the collection, we observed and tracked multiple 
tasks, such as unloading materials from the truck and arranging and loading materials onto rail cars. The tasks 
involved resources of both pedestrian workers and construction equipment. 


The RTK-GNSS solution was used for the location data collection as it performs accurately in outdoor 
environments. The RTK-GNSS solution consists of two components: a base station and rovers. The base station is 
placed statically in open space, and workers and equipment carry the rovers. Compared to a single GNSS solution 
whose accuracy is affected by atmospheric delays or clock errors, the RTK-GNSS uses the base station to provide 
correction for rovers so that the workers and equipment are located with cm-level accuracy (Wielgocka et al., 
2021). The accurate location information reduces the data processing and filtering effort when importing it into
the training environment. In addition, RTK-GNSS can cover a wider tracking area with a simpler setup than other
locating methods such as BLE or UWB.


Given that traffic from rail cars and construction equipment occurred within a limited area, the logistics at the
staging area could be congested and complicated. Therefore, pedestrian workers need sufficiently realistic
safety training to prepare for working in such an environment. To test the proposed framework, we recorded a task where
two construction workers moved materials from a storage area to a rail wagon. Figure 2 illustrates the scenario: 
One worker operates a forklift, and the other worker assists the equipment operator. The work lasted for 90 minutes. 
The pedestrian worker carries an RTK-GNSS module, and a second module is mounted on the roof of the
forklift, centrally above the operator's seat.


Fig. 2: The forklift moves material from the loading area to a railway wagon while a pedestrian worker assists.


Figure 3a shows the trajectories of the two resources during the 90 minutes. The RTK-GNSS rovers streamed the 
location data to a database with a frequency of 5Hz for the machine and 1Hz for the worker. During that time, the 
worker supported the equipment operator by attaching the material bags to the forklift in the loading area and 
removing them in the unloading area. Figure 3 shows that frequent interactions between workers and equipment
were inevitable when working simultaneously in a limited area. The 3D plot in Figure 3b visualizes the
2-dimensional movements of the forklift and the worker over time within the loading area to stress the close
interaction between the forklift and the worker. This visualization makes it easier to spot proximity events, in which
the worker was too close to the forklift. In this study, a proximity event occurs whenever a worker enters a
bounding box extending one meter beyond each side of the forklift. We ran an analysis based on an existing
approach to detect such proximity events (Golovina et al., 2019b). The worker entered the 1m bounding box 28
times during the 90 minutes. We will use this performance as a reference for the trainees in our training scenarios.
The results also indicate that the framework is applicable to safety monitoring or assessment of construction
resources, as game engines provide large libraries that support such applications.
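
The proximity rule described above can be illustrated with a short Python sketch. It is a simplified stand-in and not the method of Golovina et al. (2019b): the forklift footprint (`length`, `width`), the availability of a heading angle for each sample, and the function names are assumptions made for illustration. An event is counted each time the worker passes from outside to inside the enlarged box.

```python
import math
from typing import Iterable, Tuple

Vec2 = Tuple[float, float]


def in_proximity_box(forklift_xy: Vec2, heading_rad: float, worker_xy: Vec2,
                     length: float = 3.0, width: float = 1.2,
                     margin: float = 1.0) -> bool:
    """True if the worker is inside the forklift footprint enlarged by `margin` on each side.

    `length` and `width` are hypothetical forklift dimensions in metres.
    """
    dx = worker_xy[0] - forklift_xy[0]
    dy = worker_xy[1] - forklift_xy[1]
    # Express the offset in the forklift frame: `along` follows the heading.
    along = dx * math.cos(heading_rad) + dy * math.sin(heading_rad)
    across = -dx * math.sin(heading_rad) + dy * math.cos(heading_rad)
    return abs(along) <= length / 2 + margin and abs(across) <= width / 2 + margin


def count_proximity_events(samples: Iterable[Tuple[Vec2, float, Vec2]]) -> int:
    """Count entries into the box (outside -> inside transitions)."""
    events = 0
    was_inside = False
    for forklift_xy, heading_rad, worker_xy in samples:
        inside = in_proximity_box(forklift_xy, heading_rad, worker_xy)
        if inside and not was_inside:
            events += 1
        was_inside = inside
    return events


# Made-up, time-ordered samples: (forklift position, forklift heading, worker position).
samples = [
    ((0.0, 0.0), 0.0, (5.0, 0.0)),          # worker far ahead of the forklift
    ((0.0, 0.0), 0.0, (2.0, 0.0)),          # worker enters the box
    ((0.0, 0.0), 0.0, (2.2, 0.5)),          # worker stays inside, no new event
    ((0.0, 0.0), 0.0, (0.0, 3.0)),          # worker leaves sideways
    ((0.0, 0.0), math.pi / 2, (0.0, 2.0)),  # forklift turned, worker inside again
]
print(count_proximity_events(samples))
```

Counting transitions rather than samples avoids inflating the number of events when the worker remains inside the box over several consecutive measurements.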


[Figure 3 legend: forklift, pedestrian worker, loading area, unloading area, proximity events]


Fig. 3: (a) 2D-trajectories of the forklift and the worker during a work task over 90 minutes with 28 proximity 
events, and (b) the trajectories, including the time-axis limited to the unloading area for 15 minutes. 


4.2 From reality to virtuality: RTLS data processing 


The collected data consists of a set of locations (x,y,z) with the corresponding timestamps. The locations are
expressed in WGS84, the coordinate reference system commonly used by GNSS. As the Unity scene refers to ETRS89, we
transformed the WGS84 data into ETRS89 using Pyproj (Pyproj Contributors, 2023). Based on this data, we can
visualize the state of a resource at each recorded point in the Unity scene and place it at its real-world location.
The second component of data processing connects the individual points and moves the resource coherently with
a realistic speed, orientation, and motion. For instance, the wheels rotate, and the axle turns when steering. Figure
4 illustrates a problem: The trajectory of a machine implies that the machine first moved forward, then stopped,
and then reversed. To include such logic in the framework, we need to make assumptions and
convert them into an algorithm. This specific forklift steers with the rear axle, which must also be considered when
simulating the motions. We make the following logical propositions:


Proposition 1: The forklift only moves forward or backward and not sidewards. 

Proposition 2: The forklift changes directions if and only if it is moving. 

Proposition 3: The wheel of the forklift spins if and only if the forklift is moving.

Proposition 4: A pedestrian worker only walks forward.

Proposition 5: Distances of less than 10 cm between consecutive points are considered noise.


Fig. 4: (a) the tracked forklift, (b) the recorded trajectory, and (c) the directed trajectory after the assessment.


Some of these assumptions are simplifications. For instance, humans can walk backward or sideways, crawl, and
jump, but it is rather difficult to detect such behavior purely based on the trajectory of a human. Hence, we limit
the scope to humans walking forward. The last assumption was made after the first implementation, where the
authors noticed that motions did not appear smooth when short movements were included. For instance,
a worker who was probably standing still would constantly change direction and move by a few centimeters. The
following algorithm integrates these logical statements and moves the resources in the correct direction at a given speed.


Algorithm 1: Move object
Input: Resource, CurrentPoint, NextPoint, CurrentDirection
 1  Distance = |NextPoint - CurrentPoint|
 2  If Distance >= 0.1:
 3      NextDirection = NextPoint - CurrentPoint
 4      Duration = NextPoint.Time - CurrentPoint.Time
 5      Velocity = Distance / Duration
 6      If Resource is Machine:
 7          If |AngleBetween(CurrentDirection, NextDirection)| > PI/4:
 8              MoveBackwards(NextPoint, Velocity)
 9          Else MoveForwards(NextPoint, Velocity)
10      Else MoveForwards(NextPoint, Velocity)


The proposed algorithm moves a resource to the next point with realistic speed and rotation. If the next point is at
least 10 centimeters from the current location, the algorithm determines the required direction and the velocity.
Depending on whether the resource is a machine or a human, the algorithm moves the resource forward or
backward. The methods MoveForwards and MoveBackwards in lines 8, 9, and 10 implement how the resource
behaves when moving. Practically, this means that the resource is moved in every frame according to the velocity
and distance. The methods also implement additional animations such as rotating wheels of the machine or body
motions for the worker (moving the legs, swinging the arms).
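
For readers who prefer executable code, the decision logic of Algorithm 1 can be sketched in Python as follows. This is an illustration under stated assumptions, not the Unity implementation used in the case study: the names `TrackPoint` and `plan_move` are hypothetical, headings are expressed as angles in a 2D plane, the guard against non-positive durations is an addition, and the per-frame movement and animations of MoveForwards and MoveBackwards are left out. The noise threshold and the PI/4 angle follow the listing above.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

NOISE_THRESHOLD_M = 0.1  # Proposition 5: shorter steps are treated as noise


@dataclass
class TrackPoint:
    x: float  # metres
    y: float  # metres
    t: float  # seconds


def plan_move(is_machine: bool, current: TrackPoint, nxt: TrackPoint,
              current_heading: float) -> Optional[Tuple[str, float, float]]:
    """Return (action, speed in m/s, travel direction in rad), or None if the step is ignored."""
    dx, dy = nxt.x - current.x, nxt.y - current.y
    distance = math.hypot(dx, dy)
    if distance < NOISE_THRESHOLD_M:
        return None  # treated as sensor noise, the resource stays put
    duration = nxt.t - current.t
    if duration <= 0:
        return None  # added guard: skip duplicate or unordered timestamps
    speed = distance / duration
    travel_dir = math.atan2(dy, dx)
    # Smallest absolute angle between the current heading and the travel direction.
    diff = abs((travel_dir - current_heading + math.pi) % (2 * math.pi) - math.pi)
    if is_machine and diff > math.pi / 4:
        return ("backward", speed, travel_dir)
    # Machines moving roughly along their heading, and all pedestrians (Proposition 4).
    return ("forward", speed, travel_dir)


a = TrackPoint(0.0, 0.0, 0.0)
b = TrackPoint(1.0, 0.0, 1.0)
print(plan_move(True, a, b, 0.0))       # machine facing its travel direction
print(plan_move(True, a, b, math.pi))   # machine facing away from its travel direction
print(plan_move(False, a, b, math.pi))  # pedestrian, always forward
print(plan_move(True, a, TrackPoint(0.05, 0.0, 1.0), 0.0))  # step below 10 cm
```

How the heading is updated after each move is not specified in the listing, so the sketch only returns the travel direction and leaves that choice to the caller.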


4.3 Virtuality: Training scene 


The virtual environment was generated in the game engine Unity. We used Unity as it provides a vast selection of
assets and is simpler for conceptual work, although other game engines such as Unreal outperform Unity in graphics
quality. The virtual environment comprises the components illustrated in Figure 5: (1) a landscape, (2) the BIM model,
(3) additional objects to enhance the realism of the game scene, and (4) the moving resources connected to IoT devices.




Fig. 5: Scene components: (a) landscape, (b) BIM model, (c) site equipment, (d) human worker, and (e) forklift. 


Google's photorealistic tiles were added to the Unity scene using the asset Cesium to visualize realistic
surroundings. This asset allows for adding geo-referenced objects and utilizes the WGS84 coordinate system,
which is commonly used in GNSS applications. The tiles from Google shown in Figure 5a have two disadvantages:
First, they are outdated, and second, their quality is low. For this use case, we consider them sufficient, as the
surroundings are not the focus of this study. A frequently updated mesh generated from a laser scan or photogrammetry may increase realism and
ensure up-to-date surroundings. In the next step, we added the BIM model in the form of an IFC file. We envision 
the digital twin to provide BIM models of high quality. In this case study, a BIM model with few elements and a 
low level of development was available. The provided BIM model contains the physical structure of the site (see 
Fig. 5). A modified version of the Unity Asset IfcImporter was used to convert the IFC file into a Unity-readable
format. The provided IFC file references the geospatial coordinate system ETRS89. Hence, the origin of the BIM
model was converted into WGS84 to place it correctly within the landscape model. The local origin of the Unity
scene still relates to the ETRS89 origin of the BIM model. In this way, we refer to a Cartesian coordinate system
and no longer to the geographic coordinate system WGS84, which eases simulating the resources. In the third step,
the scene obtained additional objects to make the environment more realistic. Based on a site visit and a site layout 
plan, we added elements such as an office trailer, safety guardrails, or material storage. The site layout plan defines, 
among others, safe paths for the workers and spaces for the machines to operate. In the last step, the tracked
resources are added, and their movements are simulated based on the collected trajectories and the
algorithm proposed in the previous section. The moving resources represent the hazard for the workers, and as they
follow the real-world data, the simulation is more realistic.
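
To illustrate the idea of a local Cartesian origin, the following Python sketch converts projected coordinates into scene coordinates relative to the BIM origin. It rests on several assumptions that the case study does not state: positions are already available as projected easting, northing, and height in metres, the scene uses the common Unity convention of x pointing east, y pointing up, and z pointing north, and the origin values are made up. The function name `to_unity_local` is hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def to_unity_local(easting: float, northing: float, height: float,
                   origin: Vec3) -> Vec3:
    """Map projected coordinates (metres) to scene coordinates relative to `origin`.

    Assumed axis convention: x = east, y = up, z = north.
    """
    e0, n0, h0 = origin
    return (easting - e0, height - h0, northing - n0)


# Made-up origin for illustration; the real BIM origin is not given in the text.
bim_origin = (691000.0, 5334000.0, 520.0)
print(to_unity_local(691012.5, 5334003.0, 521.0, bim_origin))
```

Subtracting the origin keeps the scene coordinates small, which matters because game engines typically store positions as single-precision floats.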


5. EVALUATION 


The introduction described the problem of this research: Generating realistic scenarios for construction safety 
requires realistic hazards and realistic surroundings. Our framework proposes the use of RTLS data for integrating 
scenarios based on real tasks. The framework was implemented for the described construction site and tested with
the 90-minute data sample from the previous section in two training scenarios: (1) a training experiment with a
student to validate that the created hazards are more realistic, and (2) collaborative tasks where the trainee assists
the forklift. Before describing the training scenarios, we assess the accuracy of the algorithm that simulates
the motions of the resources.


5.1 Assessment of simulated data 


We evaluated the accuracy of the RTLS data integration based on two indicators. First, we measured the deviation
of the simulated data from the data collected in the real world. Second, we visually assessed the simulated data
and compared it to a video recording.


One of the main objectives of this work was to simulate hazards realistically using RTLS data. RTK-GNSS 
provides reliable and accurate data. However, the filtering algorithm processes the raw data to visualize motions 
coherently. This can generate misplaced hazards. Hence, the filtering algorithm was evaluated by comparing
the virtual trajectory to the real trajectory. For collecting the virtual trajectory, a Unity script streamed the location
of the resource with 10Hz to a database. As the real
trajectory was collected with 5Hz and 1Hz for the forklift and the worker, respectively, the time-wise closest
point from the virtual trajectory was compared to each real point. This comparison itself introduces a small error,
as the closest point can be up to 100ms apart. With a maximum speed of 30km/h, this can contribute to inaccuracy 
of up to 8.5cm. 
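
The comparison described above, matching each real point with the time-wise closest virtual point and measuring the distance between them, can be sketched in Python as follows. The function names, the tuple layout `(time, x, y)`, and the sample values are assumptions for illustration and do not reproduce the evaluation script or the data of the case study.

```python
import bisect
import math
import statistics
from typing import List, Sequence, Tuple

Sample = Tuple[float, float, float]  # (time in s, x in m, y in m)


def nearest_index(times: Sequence[float], t: float) -> int:
    """Index of the entry in the sorted list `times` that is closest to `t`."""
    i = bisect.bisect_left(times, t)
    if i == 0:
        return 0
    if i == len(times):
        return len(times) - 1
    return i if times[i] - t < t - times[i - 1] else i - 1


def deviations(real: Sequence[Sample], virtual: Sequence[Sample]) -> List[float]:
    """Distance between each real point and the time-wise closest virtual point."""
    times = [p[0] for p in virtual]  # the virtual trajectory must be sorted by time
    result = []
    for t, x, y in real:
        _, vx, vy = virtual[nearest_index(times, t)]
        result.append(math.hypot(x - vx, y - vy))
    return result


# Made-up trajectories for illustration only.
real = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 2.0, 0.0)]
virtual = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.9, 0.9, 0.1),
           (1.6, 1.5, 0.0), (2.1, 2.0, 0.3)]
d = deviations(real, virtual)
print(statistics.mean(d), statistics.median(d), statistics.stdev(d))
```

Percentiles such as those reported for the deviation distribution could be derived from the same list, for example with `statistics.quantiles`.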


Table 1 summarizes the distribution of the deviation for both the worker and the forklift. The algorithm for the
worker provides accurate results, with a standard deviation of 5.9cm and a 99th percentile of 29cm.
However, it is important to stress that the data was collected with 1Hz. The implemented algorithm
aims to simulate the movements towards a given point. In between these points, we do not have evidence of what
happened. Within this second, we do not know whether the worker turned around. Thus, data should be collected
with a higher frequency in order to evaluate the realism of the simulation better. The accuracy of the simulation
for the forklift is more meaningful for two reasons: The real-world data was collected with a higher frequency, and
the algorithm moves the resource on interpolated paths, which entails a higher deviation. The mean deviation
amounts to 13cm, and the standard deviation is 27cm. The median deviation amounts to 6.8cm, and the simulated
forklift was within 40cm of its recorded position 95% of the time. The inaccuracy of the simulation relates to the low
frequency of the collected data points. The authors conclude that a higher frequency of 30-100Hz will enable a more
reliable simulation.


Table 1: Deviations between the real trajectory from the RTLS data and the virtual trajectory.


Resource   Mean   Standard deviation   Median   95th percentile   99th percentile
Forklift   13cm   27cm                 6.8cm    40cm              46cm
Worker     10cm   5.9cm                7.6cm    21cm              29cm


The second measure to evaluate the simulation was conducted manually. We compared 10 minutes of the 
simulation to a 7-minute video recorded on-site simultaneously. The video reveals that twice during the 10 minutes, 
the forklift was simulated moving backward while it was actually driving forward. This happened during a total 
time of 28 seconds, corresponding to 6.2%. The simulated movements of the worker appeared realistic when
compared with the video.


5.2 Training Scenario 1: Simultaneous tasks 


As we mentioned previously, virtual construction safety training requires personalized feedback. Research has
proposed methods for collecting such data and visualizing it concisely for construction workers. In this
research, we use the concept of safety parameters and automatically collect data whenever trainees enter hazard
zones. In the same way as before with the worker in Section 4.1, a proximity event is triggered
once the trainee enters the bounding box around the forklift.


In the first training scenario, the trainee must collect various objects in the training scene and return them to a 
storage area. Meanwhile, the forklift and the pedestrian worker will follow the trajectories from the real-world 
data collection. During the training, the trainee needs to ensure that they will not trigger any hazards relating to 
the forklift. Figure 6a indicates such a situation: The trainee needs to cross while the forklift passes. Should the 
trainee get too close to the forklift, a close call is triggered, and data is collected and processed for
personalized feedback. The created training scenario lasts 10 minutes, during which the trainee needs to collect seven
objects. Figure 6b shows the results. The trainee crossed the road six times while the forklift was nearby. The
figure indicates that the trainee always identified the forklift while heading east. However, when returning, the trainee
was very close to the forklift twice. This data allows us to conclude that either the equipment operator should keep
more distance from the pedestrian crossing or that the trainee could not see the forklift. Figure 6b also depicts the
three proximity events of the real-world worker during this 10-minute excerpt.
Fig. 6: (a) Results of the first training scene, including the landscape, BIM model, resources, and trajectories 
from IoT devices, and (b) the trainee crossing the road while the forklift follows the real-world trajectory. 
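
To count close calls for feedback in a scenario such as this one, per-frame proximity flags have to be merged into distinct events. The sketch below shows one possible way to do so; the function name and the sample format `(timestamp, inside)` are hypothetical and not part of the described framework.

```python
def close_call_events(samples):
    """Merge consecutive 'inside the bounding box' samples into events.

    samples: iterable of (timestamp, inside) pairs sorted by timestamp,
             where `inside` is True while the trainee is in the box.
    Returns a list of (start_time, end_time) tuples, one per event.
    """
    events = []
    start = None
    last_inside = None
    for t, inside in samples:
        if inside:
            if start is None:
                start = t          # a new event begins
            last_inside = t
        elif start is not None:
            events.append((start, last_inside))  # the event has ended
            start = None
    if start is not None:          # event still open at the end of the log
        events.append((start, last_inside))
    return events
```

The number of returned tuples corresponds to the number of close calls, and their durations could serve as one input for personalized feedback.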


5.3 Training Scenario 2: Collaborative task 


In the second training scenario, the trainee takes over the role of the worker. Hence, the pedestrian worker is 
removed from the game, and the trainee is advised to support the equipment operator in loading the forklift. The 
trainee must wait for the forklift and assemble the boxes by pressing "C" on the keyboard. To ensure the worker is 
safe, the trainee is advised to wait in highlighted areas to avoid colliding with the machine. Figure 7b illustrates the 
safe area, and Figure 7a shows the results from the 90-minute task. During the 90 minutes, the worker followed 
the forklift to assist in transporting the boxes. We collected the data about the proximity to the forklift using the 
1 m bounding box described before. When the worker collides with the bounding box, a proximity event is triggered. 
Figure 7a shows the 37 collisions with the bounding box in yellow and highlights three actual hits in red. The 
figure indicates that the actual hits occurred not in the loading area but when the forklift was approaching the 
worker while the worker walked to the unloading area. The forklift was emitting sound, and the game engine 
increased the volume as the trainee came closer to the sound source. Still, the worker did not avoid the equipment. 
However, it is likely that in real-world situations, the equipment operator would have stopped. These collisions are 
not realistic and reveal one major disadvantage of this approach: The machine follows the trajectory without 
reasoning. Hence, future work needs to investigate how to include such reasoning, as the operator would most 
likely avoid the worker. Comparing the virtual performance of the trainee to the real-world worker, the trainee 
triggered more proximity events. There can be different reasons, but all yellow proximity events were triggered at 
the front. A reason could be that the forklift was rigidly following the trajectory from the data collection and 
did not interact with the worker. There was no communication possible between the forklift operator and the 
construction worker. Hence, if the worker, for instance, has not yet finished mounting the box, the forklift will still 
continue on its route and almost hit the worker. The authors propose two approaches to tackling this issue. First, 
the forklift may include intelligence, such as avoiding the worker while driving or waiting for the trainee to finish 
their task before moving. Second, a multiplayer game where another trainee takes over the role of the equipment 
operator could enhance realism and improve collaboration between the workforce. 
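
The first approach could, in its simplest form, hold the trajectory playback while the trainee is too close to the forklift's next recorded position. The following sketch illustrates this idea under stated assumptions: `step_playback`, the waypoint list, and the 2 m stop distance are hypothetical and have not been implemented or evaluated in this study.

```python
import math


def step_playback(trajectory, index, trainee_xy, stop_distance=2.0):
    """Advance the forklift by one recorded waypoint unless the trainee
    is within `stop_distance` metres of the next waypoint.

    trajectory: list of recorded (x, y) waypoints from the RTLS data
    index:      index of the waypoint the forklift currently occupies
    Returns the waypoint index to use for the next simulation step.
    """
    if index >= len(trajectory) - 1:
        return index                      # end of the recorded trajectory
    next_waypoint = trajectory[index + 1]
    if math.dist(next_waypoint, trainee_xy) < stop_distance:
        return index                      # wait until the trainee has moved away
    return index + 1
```

Note that holding the playback shifts the forklift's timing relative to the recorded data, so other replayed resources may need to be paused or re-synchronized as well.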


[Figure 7 legend: player trajectory; forklift trajectory; safe waiting area; position within the 2 m cylinder; position within the 1 m bounding box] 


Fig. 7: (a) Results from Training Scenario 2, indicating the trajectories and proximity events, and (b) safe 
waiting areas where the trainee can safely wait for the forklift to return and come to a standstill. 


6. CONCLUSIONS AND FUTURE WORK 


In this paper, we addressed the challenge of creating realistic virtual training scenarios for construction safety. 
Despite the advancements in virtual learning, the integration of real-world data from IoT devices was largely 
missing, hindering the level of realism regarding hazardous situations. Our proposed framework leverages RTLS 
data to enhance the extent of world knowledge and coherence of VTEs, resulting in more realistic and contextually 
relevant experiences as the hazards relate to real-world scenarios. 


Through a case study conducted at a construction site in Munich, Germany, we validated the effectiveness of our 
framework. The integration of RTLS data allowed us to accurately represent the movements of construction 
workers and equipment within the virtual learning environment for safety training purposes. The data processing 
algorithms and logical propositions ensured realistic motions of the resources, further enhancing the coherence of 
the virtual environment. Additionally, we demonstrated the practicality of the proposed method by creating a 
realistic training scenario involving hazardous interactions between construction workers and equipment. The 
study also indicates potential in creating training scenarios for collaborative tasks between humans and equipment 
based on real-world data. 


Nevertheless, the framework requires a more responsive simulation where the equipment not only follows a real 
path but can also stop and continue based on the trainees' behavior and feedback, or avoid them when it has a clear 
line of sight. It also raises further research questions, for example, whether it would be better to create multiplayer games 
for collaborative work tasks rather than making equipment follow realistic but, for human learners, eventually 
predictable travel routes. As work is underway, expanding our framework to support multiple trainees interacting 
and collaborating within the virtual environment would foster a more dynamic and engaging training experience, 
mirroring the teamwork and coordination of real-world construction sites. 


This preliminary work successfully bridged the gap in virtual training for construction safety by integrating 
real-world data from IoT devices, but there are several avenues for future improvements: RTLS data should be recorded 
at a higher frequency than 1 Hz, and additional sensors on relevant static or dynamic objects in the scenery could 
further enhance the realism. For the forklift, additional sensors could detect the vehicle's orientation or the fork's 
exact location and extension. This would ease the simulation of movements. In addition, a Body Motion Suit (BMS) 
for the construction worker may provide more information on how the worker executes a specific task, adding a 
level of detail that may be needed for some construction hazards. 
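
Until higher-frequency recordings are available, positions between two 1 Hz samples have to be estimated for rendering. The sketch below shows plain linear interpolation as one simple option; it is an illustrative assumption, not the filtering algorithm used in this work, and it cannot recover orientation or maneuvers that happen between samples.

```python
def interpolate_position(samples, t):
    """Estimate the (x, y) position at time t from sparse RTLS samples.

    samples: list of (timestamp_s, x, y) tuples with strictly increasing
             timestamps (assumed input format)
    t:       query time in seconds; values outside the recorded range are
             clamped to the first or last sample
    """
    if not samples:
        raise ValueError("at least one sample is required")
    if t <= samples[0][0]:
        return samples[0][1:]
    if t >= samples[-1][0]:
        return samples[-1][1:]
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            a = (t - t0) / (t1 - t0)      # fraction of the way from t0 to t1
            return (x0 + a * (x1 - x0), y0 + a * (y1 - y0))
```

For example, a game engine running at 60 frames per second could query this function once per frame with the current simulation time.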
In conclusion, our preliminary research efforts contribute to the advancement of virtual training for construction 
safety by leveraging IoT data to create more realistic and coherent training scenarios. As we continue to explore 
and refine the proposed framework, it has the potential to significantly improve safety awareness and reduce 
occupational injuries and fatalities in the construction industry. 